
    Hybrid models for combination of visual and textual features in context-based image retrieval.

    Visual Information Retrieval poses a challenge to intelligent information search systems. This is due to the semantic gap: the difference between human perception (information needs) and the machine representation of multimedia objects. Most existing image retrieval systems are monomodal, as they utilize only visual or only textual information about images. The semantic gap can be reduced by improving existing visual representations, making them suitable for large-scale generic image retrieval. The best current candidates for large-scale Content-based Image Retrieval are models based on the Bag of Visual Words framework. Existing approaches, however, produce high-dimensional representations that are expensive to store and compute. Because the standard Bag of Visual Words framework disregards the relationships between the histogram bins, the model can be further enhanced by exploiting the correlations between the visual words. Even improved visual features will struggle to capture the abstract semantic meaning of some queries, e.g. "straight road in the USA". Textual features, on the other hand, would struggle with queries such as "church with more than two towers", as in many cases the information about the number of towers would be missing. Thus, visual and textual features represent complementary yet correlated aspects of the same information object, an image. Existing hybrid approaches for the combination of visual and textual features do not take these inherent relationships into account, and thus the combination's performance improvement is limited. Visual and textual features can also be combined in the context of relevance feedback, which can help narrow down and correct the search. The feedback mechanism produces subsets of visual query and feedback representations, as well as subsets of textual query and feedback representations. A meaningful feature combination in the context of relevance feedback should take the inherent inter-modal (visual-textual) and intra-modal (visual-visual, textual-textual) relationships into account. In this work, we propose a principled framework for semantic gap reduction in large-scale generic image retrieval. The proposed framework comprises the development and enhancement of novel visual features, a hybrid model for the combination of visual and textual features, and a hybrid model for the combination of features in the context of relevance feedback, with both fixed and adaptive weighting schemes (importance of a query and its context). Apart from the experimental evaluation of our models, theoretical validations of some interesting findings on feature fusion strategies were also performed. The proposed models were incorporated into our prototype system with an interactive user interface.
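    To make the fixed weighting scheme concrete, below is a minimal late-fusion sketch in Python that combines per-modality cosine similarities with a single weight alpha. All names and the parameter value are hypothetical; the thesis's hybrid models go further by exploiting the inter- and intra-modal correlations that this baseline ignores.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity between two feature vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def hybrid_score(vis_q, txt_q, vis_d, txt_d, alpha=0.5):
    # Fixed weighting scheme: alpha (hypothetical value) trades off the
    # visual similarity against the textual similarity of document d to
    # query q. An adaptive scheme would instead set alpha per query.
    return alpha * cosine(vis_q, vis_d) + (1.0 - alpha) * cosine(txt_q, txt_d)
```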

    RGU at ImageCLEF2010 Wikipedia Retrieval Task

    This working notes paper describes our first participation in the ImageCLEF2010 Wikipedia Retrieval Task. In this task, we mainly test our Quantum Theory-inspired retrieval function on cross-media retrieval. Instead of heuristically combining ranking scores computed independently for different media types, we develop a tensor product based model that represents the textual and visual content features of an image as a non-separable composite system. This composite system incorporates the statistical/semantic dependencies between certain features. The ranking scores of the images are then computed in a manner analogous to quantum measurement. Meanwhile, we also test a new local feature that we have developed for content-based image retrieval.
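    A rough sketch of the tensor product idea, assuming rank-one composites built from plain textual and visual feature vectors (the names are hypothetical, and the actual model presumably uses richer composites):

```python
import numpy as np

def tensor_score(q_txt, q_vis, d_txt, d_vis):
    # Each object is represented as the tensor (outer) product of its
    # textual and visual feature vectors; the ranking score is the inner
    # product of the query and document composite systems. For rank-one
    # composites this factorises:
    #   <q_t (x) q_v, d_t (x) d_v> = <q_t, d_t> * <q_v, d_v>
    Q = np.outer(q_txt, q_vis)
    D = np.outer(d_txt, d_vis)
    return float(np.sum(Q * D))
```

    A genuinely non-separable composite would be a sum of such outer products that does not factorise into one textual and one visual vector, which is where cross-feature dependencies can enter the score.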

    Improving content based image retrieval by identifying least and most correlated visual words

    In this paper, we propose a model for the direct incorporation of image content into a (short-term) user profile based on correlations between visual words and adaptation of the similarity measure. The relationships between visual words at different contextual levels are explored. We introduce and compare various notions of correlation, which in general we refer to as image-level and proximity-based. The information about the most and the least correlated visual words can be exploited in order to adapt the similarity measure. The evaluation, preceding an experiment involving real users (future work), is performed within the Pseudo Relevance Feedback framework. We test our new method on three large data collections, namely MIRFlickr, ImageCLEF, and a collection from the British National Geological Survey (BGS). The proposed model is computationally cheap and scalable to large image collections.
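    One plausible instantiation of the image-level notion of correlation is the Pearson correlation between visual-word counts across the collection; the sketch below (the function name and inputs are assumptions for illustration, not the paper's exact definition) returns the most and least correlated word pairs.

```python
import numpy as np

def word_correlations(histograms):
    # histograms: (n_images, n_words) bag-of-visual-words count matrix.
    # Image-level correlation: Pearson correlation between the count
    # columns of every pair of visual words across the collection.
    corr = np.corrcoef(np.asarray(histograms, dtype=float), rowvar=False)
    np.fill_diagonal(corr, np.nan)  # ignore trivial self-correlations
    most = np.unravel_index(np.nanargmax(corr), corr.shape)
    least = np.unravel_index(np.nanargmin(corr), corr.shape)
    return corr, most, least
```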

    Early fusion and query modification in their dual late fusion forms.

    In this paper, we prove that specific widely used information fusion models in Content-based Image Retrieval are interchangeable. These models are often classified as representing early or late fusion strategies. In addition, we show that even advanced, non-standard fusion strategies can be represented in dual forms. We also prove that the standard query modification method with specific similarity measurements can be represented in a late fusion form.
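    The inner-product similarity illustrates this duality directly: weighted early fusion by feature concatenation equals a weighted late fusion of per-modality scores. A small numerical sanity check (feature dimensions and weights are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
v_q, v_d = rng.random(8), rng.random(8)   # visual query/document features
t_q, t_d = rng.random(5), rng.random(5)   # textual query/document features
a, b = 0.7, 0.3                           # modality weights

# Early fusion: weight, concatenate, then compare once.
early = np.dot(np.concatenate([a * v_q, b * t_q]),
               np.concatenate([a * v_d, b * t_d]))

# Late fusion: compare per modality, then combine the scores.
late = a**2 * np.dot(v_q, v_d) + b**2 * np.dot(t_q, t_d)

assert np.isclose(early, late)  # the two fusion forms coincide
```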

    Twitter response to televised political debates in Election 2015.

    The advent of social media such as Twitter has revolutionised our conversations about live television events. In the days before the Internet, conversation about television programmes was limited to those sitting on the sofa with you and people you met the next morning – so-called ‘water-cooler conversation’. Now, however, it is possible to discuss events on the screen in real time with people all over the country – three out of five UK Twitter users tweet while watching television (Nielsen, 2013). Thus it is not surprising to find that the General Election’s television events generated debate and discussion on Twitter.

    Novel local features with hybrid sampling technique for image retrieval

    In image retrieval, most existing approaches that incorporate local features produce high-dimensional vectors, which lead to high computational and data storage costs. Moreover, when it comes to the retrieval of generic real-life images, randomly generated patches are often more discriminant than the ones produced by corner/blob detectors. In order to tackle these problems, we propose a novel method incorporating local features with hybrid sampling (a combination of detector-based and random sampling). We evaluate on three large data collections: MIRFlickr, ImageCLEF, and a collection from the British National Geological Survey. The overall performance of the proposed approach is better than that of global features and comparable with current state-of-the-art methods in content-based image retrieval. One advantage of our method compared with others is its easy implementation and low computational cost. Another is that hybrid sampling can improve the performance of other methods based on the "bag of visual words" approach.
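    A minimal sketch of hybrid sampling, using OpenCV's SIFT detector and descriptor as a stand-in for the paper's novel local features (all parameter values are illustrative assumptions):

```python
import cv2
import numpy as np

def hybrid_descriptors(img_gray, n_random=200, patch_size=16, seed=0):
    # Hybrid sampling: detector-based keypoints plus uniformly random
    # patch centres, all described with the same local descriptor.
    sift = cv2.SIFT_create()
    detected = list(sift.detect(img_gray, None))

    rng = np.random.default_rng(seed)
    h, w = img_gray.shape[:2]  # assumes the image exceeds 2 * patch_size
    xs = rng.integers(patch_size, w - patch_size, n_random)
    ys = rng.integers(patch_size, h - patch_size, n_random)
    random_kps = [cv2.KeyPoint(float(x), float(y), float(patch_size))
                  for x, y in zip(xs, ys)]

    _, descriptors = sift.compute(img_gray, detected + random_kps)
    return descriptors  # e.g. feed into a bag-of-visual-words pipeline
```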